Polypeptides having granulocyte colony stimulating activity, their preparation and pharmaceutical compositions containing them

ABSTRACT

New polypeptides having granulocyte colony stimulating activity, preparation thereof and pharmaceutical compositions containing said polypeptides.

The present invention relates to new polypeptides having human granulocyte colony stimulating activity, to their preparation and to pharmaceutical compositions containing them.

BACKGROUND OF THE INVENTION

The present invention relates especially to chimeric polypeptides composed of a biologically active portion consisting of all or part of G-CSF or of a variant of G-CSF, and an essentially proteinaceous stabilizing structure endowing it with new biological properties.

Human G-CSF is a secreted polypeptide of 174 amino acids having a molecular weight of approximately 18 kD. It was isolated initially from a cancer cell line (EP 169,566), and its gene has been cloned, sequenced and expressed in different cell hosts by genetic engineering techniques (EP 215,126, EP 220,520). An mRNA potentially coding for a form of G-CSF having 177 amino acids has, moreover, been detected [Nagata S. et al., EMBO J. 5 (1986) 575-581]. G-CSF possesses the capacity to stimulate the differentiation and proliferation of bone marrow stem cells to granulocytes. As such, it possesses the capacity to stimulate the body's protective capacities against infection by promoting the growth of polymorphonuclear neutrophils and their differentiation to maturity. It is thus capable of activating the body's prophylactic functions, and may be used in different pathological situations in which the number of neutrophils is abnormally low or in which the immune system needs to be strengthened. Such situations arise, for example, following cancer chemotherapy treatments, in transplantation, and especially bone marrow transplantation, or in leukopenic states.

One of the drawbacks of currently available G-CSF is that it is rapidly degraded by the body once administered. This is all the more noticeable because G-CSF is generally used at low doses. Furthermore, the use of larger doses has not improved the therapeutic capacities of this molecule, and may induce adverse side effects. These phenomena of elimination and degradation in vivo hence constitute at present an obstacle to the exploitation of the biological activity of G-CSF as a pharmaceutical agent.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention enables these drawbacks to be remedied. The present invention provides, in effect, new molecules enabling the biological properties of G-CSF to be optimally exploited from a therapeutic standpoint. The Applicant demonstrated, in effect, that optimal G-CSF activity was manifested when the G-CSF was present at a low dose and for a prolonged time. The Applicant has now produced molecules capable of maintaining G-CSF activity in the body for a sufficiently long time. Furthermore, the Applicant has shown that it is possible to express, in cell hosts at high levels, genetic fusions generating chimeras possessing new pharmacokinetic properties and the desirable biological properties of G-CSF. In particular, hybrid polypeptides of the invention retain their affinity for G-CSF receptors, and are sufficiently functional to lead to proliferation and to cell differentiation. The molecules of the invention possess, moreover, distribution and pharmacokinetic properties which are especially advantageous in the body, and enable their biological activity to be developed therapeutically.

A subject of the present invention hence relates to recombinant polypeptides containing an active portion consisting of all or part of G-CSF, or of a variant of G-CSF, and an essentially proteinaceous stabilizing structure.

For the purposes of the present invention, the term variant of G-CSF denotes any molecule obtained by modification of the sequence lying between the residues Thr586 and Pro759 of the sequence presented in FIG. 1, retaining G-CSF activity, that is to say the capacity to stimulate the differentiation of target cells and the formation of granulocyte colonies. This sequence corresponds to that of mature G-CSF described by Nagata et al. [EMBO J. 5 (1986) 575-581]. Modification is understood to mean any mutation, substitution, deletion, addition or modification resulting from an action of a genetic and/or chemical nature. Such variants may be generated for different purposes, such as, in particular, that of increasing the affinity of the molecule for the G-CSF receptor(s), that of improving its levels of production, that of increasing its resistance to proteases, that of increasing its therapeutic efficacy or of reducing its side effects, or that of endowing it with new pharmacokinetic and/or biological properties.

Especially advantageous polypeptides of the invention are those in which the biologically active portion possesses:

(a) the peptide sequence lying between the residues Thr586 and Pro759 of the sequence presented in FIG. 1, or

(b) a portion of the structure (a), or

(c) a structure derived from the structures (a) or (b) by structural modifications (mutation, addition, substitution and/or deletion of one or more residues), and having an identical or modified biological activity. This latter type of polypeptide comprises, for example, molecules in which some glycosylation sites have been modified or eliminated, as well as molecules in which one, several or even all the cysteine residues have been substituted. It also comprises molecules obtained from (a) or (b) by deletion of regions that display little or no participation in the activity or which participate in an undesirable activity, and molecules containing, relative to (a) or (b), additional residues such as, for example, an N-terminal methionine or a secretion signal.

More preferably, the chimeric polypeptides of the invention comprise an active portion of type (a).

The active portion of the molecules of the invention may be coupled to the proteinaceous stabilizing structure either directly or via a peptide linker. Furthermore, it can constitute the N-terminal end or the C-terminal end of the molecule. Preferably, in the molecules of the invention, the active portion constitutes the C-terminal portion of the chimera.

As stated above, the stabilizing structure of the polypeptides of the invention is essentially proteinaceous.

Preferably, this structure is a polypeptide possessing a long plasma half-life. As an example, it can be an albumin, an apolipoprotein, an immunoglobulin or alternatively a transferrin. Appropriate peptides can also be ones derived from such proteins by structural modifications, or artificially or semi-artificially synthesized peptides possessing a long plasma half-life. Moreover, the stabilizing structure used is, more preferably, a polypeptide which is weakly immunogenic or non-immunogenic for the organism in which the polypeptides of the invention are used.

In an especially advantageous embodiment of the invention, the stabilizing structure is an albumin or a variant of albumin, and for example human serum albumin (HSA). It is understood that variants of albumin denote any protein having a long plasma half-life obtained by modification (mutation, deletion and/or addition) by genetic engineering techniques of a gene coding for a given isomorph of human serum albumin, as well as any macromolecule having a long plasma half-life obtained by in vitro modification of the protein encoded by such genes. Since albumin is very polymorphic, many natural variants have already been identified, and more than 30 different genetic types have been listed [Weitkamp L. R. et al., Ann. Hum. Genet. 37 (1973) 219]. More preferably, the stabilizing structure is a mature albumin.

As examples, there may be mentioned polypeptides of the invention containing, in the N-terminal→C-terminal direction, (i) the mature HSA sequence coupled directly to the mature G-CSF sequence (see FIG. 1), or (ii) the mature G-CSF sequence coupled via a peptide linker to the mature HSA sequence.

Another subject of the invention relates to a method for preparing the chimeric molecules described above. More specifically, this method consists in causing a eukaryotic or prokaryotic cell host to express a nucleotide sequence coding for the desired polypeptide, and in then harvesting the polypeptide produced.

Among eukaryotic hosts which are usable in the context of the present invention, animal cells, yeasts or fungi may be mentioned. In particular, as regards yeasts, yeasts of the genus Saccharomyces, Kluyveromyces, Pichia, Schwanniomyces or Hansenula may be mentioned. As regards animal cells, COS, CHO and C127 cells, and the like, may be mentioned. Among fungi capable of being used in the present invention, Aspergillus spp. or Trichoderma spp. may be mentioned more especially. As prokaryotic hosts, it is preferable to use bacteria such as Escherichia coli, or ones belonging to the genera Corynebacterium, Bacillus or Streptomyces.

The nucleotide sequences which are usable in the context of the present invention may be prepared in different ways. Generally, they are obtained by assembling in a reading frame the sequences coding for each of the functional portions of the polypeptide. These may be isolated by the techniques of a person skilled in the art, and, for example, directly from the cellular messenger RNAs (mRNAs), or by recloning from a library of complementary DNA (cDNA) isolated from cells that make the product, or alternatively the nucleotide sequences in question may be completely synthetic ones. It is understood, furthermore, that the nucleotide sequences may also be modified subsequently, for example by genetic engineering techniques, to obtain derivatives or variants of the said sequences.

More preferably, in the method of the invention, the nucleotide sequence forms part of an expression cassette comprising a transcription initiation region (promoter region) permitting, in the host cells, expression of the nucleotide sequence placed under its control and coding for the polypeptides of the invention. This region can originate from promoter regions of genes which are strongly expressed in the host cell used, the expression being constitutive or regulable. As regards yeasts, the promoter can be that of the gene for phosphoglycerate kinase (PGK), for glyceraldehyde-3-phosphate dehydrogenase (GPD), for lactase (LAC4), for enolases (ENO), for alcohol dehydrogenases (ADH), and the like. As regards bacteria, the promoter can be that of the left or right genes of bacteriophage lambda (P_(L), P_(R)), or alternatively those of the genes of the tryptophan (P_(trp)) or lactose (P_(lac)) operons. In addition, this control region may be modified, for example by in vitro mutagenesis, by the introduction of additional control elements or of synthetic sequences or by deletions or substitutions of the original control elements. The expression cassette can also comprise a transcription termination region which is functional in the host envisaged, positioned immediately downstream of the nucleotide sequence coding for a polypeptide of the invention.

In a preferred embodiment, the polypeptides of the invention result from the expression of a nucleotide sequence in a eukaryotic or prokaryotic host and the secretion of the expression product of the said sequence into the culture medium. It is, in effect, especially advantageous to be able to obtain molecules directly in the culture medium using recombinant methods. In this case, the nucleotide sequence coding for a polypeptide of the invention is preceded by a leader sequence (or signal sequence) directing the nascent polypeptide into the pathways of secretion of the host used. This leader sequence can be the natural signal sequence of G-CSF or of the stabilizing structure in the case where the latter is a naturally secreted protein, but it can also be any other functional leader sequence, or an artificial leader sequence. The choice of one or other of these sequences is, in particular, guided by the host used. Examples of functional signal sequences include those of the genes for the sex pheromones or for the killer toxins of yeasts.

In addition to the expression cassette, one or more markers enabling the recombinant host to be selected may be added, such as, for example, the URA3 gene of S. cerevisiae yeast, or genes conferring resistance to antibiotics such as geneticin (G418) or to any other toxic compound such as certain metal ions.

The assembly consisting of the expression cassette and the selectable marker may be either introduced directly into the host cells in question, or inserted beforehand into a functional self-replicating vector. In the first case, sequences homologous with regions present in the genome of the host cells are preferably added to this assembly, the said sequences then being positioned on each side of the expression cassette and the selectable gene so as to increase the frequency of integration of the assembly in the host's genome by targeting integration of the sequences by homologous recombination. In the case where the expression cassette is inserted into a replicating system, a preferred replication system for yeasts of the genus Kluyveromyces is derived from plasmid pKD1 initially isolated from K. drosophilarum; a preferred replication system for yeasts of the genus Saccharomyces is derived from the 2μ plasmid of S. cerevisiae. Furthermore, this expression plasmid can contain all or part of the said replication systems, or can combine elements derived from plasmid pKD1 as well as from the 2μ plasmid.

In addition, the expression plasmids can be shuttle vectors between a bacterial host such as Escherichia coli and the chosen host cell. In this case, an origin of replication and a selectable marker that function in the bacterial host are required. It is also possible to position restriction sites surrounding the bacterial and unique sequences on the expression vector: this enables the sequences to be eliminated by cutting and religation in vitro of the truncated vector before transformation of the host cells, which can result in an increase in copy number and in an enhanced stability of the expression plasmids in the said hosts. For example, such restriction sites can correspond to sequences such as 5'-GGCCNNNNNGGCC-3' (SEQ ID NO: 5, SfiI) or 5'-GCGGCCGC-3' (NotI), inasmuch as these sites are extremely rare and generally absent from an expression vector.
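
The claim that these two sites are generally absent from a vector can be checked on any candidate sequence before the design is finalized. The following is a minimal sketch in Python, not part of the original disclosure: the function name and the toy sequence are hypothetical, and only the two recognition sequences are taken from the text (N standing for any base). Both sites are palindromic, so scanning a single strand is sufficient.

```python
import re

# Recognition sequences quoted in the text; N stands for any base.
RARE_SITES = {
    "SfiI": "GGCCNNNNNGGCC",
    "NotI": "GCGGCCGC",
}

def find_rare_sites(sequence):
    """Return 0-based start positions of each rare site in `sequence`."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in RARE_SITES.items():
        # A lookahead pattern also reports overlapping occurrences.
        pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
        hits[enzyme] = [m.start() for m in re.finditer(pattern, sequence)]
    return hits

# Hypothetical toy sequence built for illustration only; it is not taken
# from any plasmid described in this document.
TOY_SEQUENCE = "ATAT" + "GCGGCCGC" + "TTTT" + "GGCCAAAAAGGCC" + "TTAT"

print(find_rare_sites(TOY_SEQUENCE))
```

An empty list for an enzyme means that the corresponding site could be introduced as a unique flanking site without cutting elsewhere in the scanned sequence.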

After the construction of such expression vectors or cassettes, these are introduced into the selected host cells according to standard techniques described in the literature. In this connection, any method enabling a foreign DNA to be introduced into a cell may be used. This can comprise, in particular, transformation, electroporation, conjugation or any other technique known to a person skilled in the art. As an example for yeast type hosts, the different Kluyveromyces strains used have been transformed by treating the whole cells in the presence of lithium acetate and polyethylene glycol, according to a technique described by Ito et al. [J. Bacteriol. 153 (1983) 163]. The transformation technique described by Durrens et al. [Curr. Genet. 18 (1990) 7] using ethylene glycol and dimethyl sulphoxide has also been used. It is also possible to transform yeasts by electroporation, according to the method described by Karube et al. [FEBS Letters 182 (1985) 90]. An alternative protocol is also described in detail in the examples which follow.

After selection of the transformed cells, the cells expressing the said polypeptides are inoculated and recovery of the said polypeptides may be carried out, either during cell growth for "continuous" methods, or at the end of growth for "batch" cultures. The polypeptides which form the subject of the present invention are then purified from the culture supernatant for the purpose of their molecular, pharmacokinetic and biological characterization.

A preferred expression system for the polypeptides of the invention consists in using yeasts of the genus Kluyveromyces as host cell, the yeasts being transformed with certain vectors derived from the extrachromosomal replicon pKD1 initially isolated in K. marxianus var. drosophilarum. These yeasts, and especially K. lactis and K. fragilis, are generally capable of stably replicating the said vectors, and possess, in addition, the advantage of being included in the list of GRAS (Generally Recognized As Safe) organisms. Favoured yeasts are preferably industrial strains of the genus Kluyveromyces which are capable of stably replicating the said plasmids derived from plasmid pKD1, and into which has been inserted a selectable marker as well as an expression cassette permitting the secretion of the polypeptides of the invention at high levels.

The present invention also relates to the nucleotide sequences coding for the chimeric polypeptides described above, as well as the recombinant eukaryotic or prokaryotic cells comprising such sequences.

The present invention also relates to the application of the polypeptides according to the present invention as a medicinal product. More especially, the subject of the invention is any pharmaceutical composition comprising one or more polypeptides as described above. More especially, these compositions may be used in all pathological situations in which the number and/or activity of granulocytes need to be stimulated. In particular, they may be used for the prevention or treatment of leukopenic states or of some leukaemias or, in the case of transplantation or of cancer treatment, for strengthening or restoring the immune system.

The present invention will be described more fully by means of the examples which follow, which are to be considered as illustrative and non-limiting.

LIST OF FIGURES

The representations of the plasmids shown in the following figures are not drawn to scale, and only the restriction sites which are important for an understanding of the clonings carried out have been shown.

FIG. 1: Nucleotide sequence (SEQ ID NO: 1) and deduced amino acid sequence (SEQ ID NO: 2) of the HindIII restriction fragment of plasmid pYG1259 (chimera prepro-HSA-G.CSF). The solid arrows indicate the end of the HSA "pre" and "pro" regions. The MstII, ApaI and SstI (SacI) restriction sites are underlined. The G-CSF peptide sequence is in italics (Thr586→Pro759; the numbering of the amino acids corresponds to the mature chimeric protein).

FIG. 2: Diagrammatic representation of chimeras of the HSA-G.CSF type (A) and of the G.CSF-HSA (B) or G.CSF-HSA-G.CSF (C) type. Abbreviations used: M/LP, translation initiation methionine, where appropriate followed by a secretion signal sequence; HSA, mature human serum albumin or one of its variants; G.CSF, peptide derived from G-CSF and having an identical or modified activity. The solid arrow indicates the N-terminal end of the mature protein.

FIG. 3: Restriction map of plasmid pYG105, and strategy of construction of the plasmids for the expression of the chimeric proteins of the present invention. Abbreviations used: P, transcription promoter; T, transcription terminator; IR, inverted repeat sequences of plasmid pKD1; LP_(HSA), HSA "prepro" region; Ap^(r) and Km^(r) denote, respectively, the genes for resistance to ampicillin (E. coli) and to G418 (yeasts).

FIG. 4: Characterization of the material secreted after 4 days of culture (Erlenmeyers) of the strain CBS 293.91 transformed with plasmids pYG1266 (plasmid for the expression of a chimera of the HSA-G.CSF type) and pKan707 (control plasmid). In this experiment, the samples shown in diagrams A, B and C were run on the same gel (SDS-PAGE 8.5%) and then treated separately.

A, Coomassie blue staining; molecular weight standard (lane 2); supernatant equivalent to 100 μl of the culture transformed with plasmids pKan707 in YPL medium (lane 1), or pYG1266 in YPD (lane 3) or YPL (lane 4) medium.

B, immunological characterization of the material secreted after the use of primary antibodies directed against human G-CSF: same legend as in A.

C, immunological characterization of the material secreted after the use of primary antibodies directed against human albumin: same legend as in A.

FIG. 5: Nucleotide sequence (SEQ ID NO: 3) and deduced amino acid sequence (SEQ ID NO: 4) of the HindIII restriction fragment of plasmid pYG1301 (chimera G.CSF-Gly₄-HSA). The solid arrows indicate the end of the HSA "pre" and "pro" regions. The ApaI, SstI (SacI) and MstII restriction sites are underlined. The G.CSF (174 residues) and HSA (585 residues) domains are separated by the synthetic linker GGGG. The numbering of the amino acids corresponds to the mature chimeric G.CSF-Gly₄-HSA protein (763 residues). The nucleotide sequence lying between the translation termination codon and the HindIII site originates from HSA complementary DNA (cDNA) as described in Patent Application EP 361,991.

FIG. 6: Characterization of the material secreted after 4 days of culture (Erlenmeyers, in YPD medium) of the strain CBS 293.91 transformed with plasmids pYG1267 (chimera HSA-G.CSF), pYG1303 (chimera G.CSF-Gly₄-HSA) and pYG1352 (chimera HSA-Gly₄-G.CSF) after migration on SDS-PAGE 8.5% gel.

A, Coomassie blue staining; supernatant equivalent to 100 μl of the culture transformed with plasmids pYG1303 (lane 1), pYG1267 (lane 2) or pYG1352 (lane 3); molecular weight standard (lane 4).

B, immunological characterization of the material secreted after the use of primary antibodies directed against human G-CSF: same legend as in A.

FIG. 7: Activity with respect to in vitro cell proliferation of the murine line NFS60. The radioactivity ([³H]thymidine) incorporated in the cell nuclei after 6 hours of incubation is shown as ordinates (cpm); the amount of product shown as abscissae is expressed in molarity (arbitrary units).

FIG. 8: Activity with respect to in vivo granulopoiesis in rats. The number of neutrophils (mean of 7 animals) is shown as ordinates as a function of time. The products tested are the chimera HSA-G.CSF (pYG1266, 4 or 40 mg/rat/day), reference G-CSF (10 mg/rat/day), recombinant HSA purified from Kluyveromyces lactis supernatant (rHSA, 30 mg/rat/day, see EP 361,991) or physiological saline.
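
The residue numbering used in the legends of FIG. 1 and FIG. 5 follows directly from the domain lengths stated there (mature HSA, 585 residues; mature G-CSF, 174 residues; linker, 4 glycines). The following minimal Python sketch reproduces that arithmetic; the function name is hypothetical and the code is offered only as an illustration of how the coordinates are derived.

```python
def domain_layout(domains):
    """Assign 1-based inclusive residue ranges to consecutive domains.

    `domains` is a list of (name, length) pairs in N- to C-terminal order.
    """
    layout = []
    start = 1
    for name, length in domains:
        end = start + length - 1
        layout.append((name, start, end))
        start = end + 1
    return layout

# Domain lengths as stated in the figure legends.
HSA_GCSF = domain_layout([("HSA", 585), ("G-CSF", 174)])
GCSF_GLY4_HSA = domain_layout([("G-CSF", 174), ("Gly4", 4), ("HSA", 585)])

for chimera_name, layout in (("HSA-G.CSF", HSA_GCSF), ("G.CSF-Gly4-HSA", GCSF_GLY4_HSA)):
    for domain, start, end in layout:
        print(f"{chimera_name}: {domain} spans residues {start}-{end}")
```

For the HSA-G.CSF chimera this places G-CSF at residues 586 to 759, matching Thr586→Pro759 in FIG. 1; for G.CSF-Gly₄-HSA the last residue is 763, matching the length given for FIG. 5.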

EXAMPLES

GENERAL CLONING TECHNIQUES

The methods traditionally used in molecular biology, such as preparative extractions of plasmid DNA, caesium chloride gradient centrifugation of plasmid DNA, agarose or acrylamide gel electrophoresis, purification of DNA fragments by electroelution, protein extraction with phenol or phenol/chloroform, ethanol or isopropanol precipitation of DNA in a saline medium, transformation in Escherichia coli, and the like, are well known to a person skilled in the art and are amply described in the literature [Maniatis T. et al., "Molecular Cloning, a Laboratory Manual", Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982; Ausubel F. M. et al. (eds), "Current Protocols in Molecular Biology", John Wiley & Sons, New York, 1987].

Restriction enzymes were supplied by New England Biolabs (Biolabs), Bethesda Research Laboratories (BRL) or Amersham, and are used according to the suppliers' recommendations.

pBR322 and pUC type plasmids and phages of the M13 series are of commercial origin (Bethesda Research Laboratories).

For the ligations, the DNA fragments are separated according to their size by agarose or acrylamide gel electrophoresis, extracted with phenol or with a phenol/chloroform mixture, precipitated with ethanol and then incubated in the presence of phage T4 DNA ligase (Biolabs) according to the supplier's recommendations.

The filling in of 5' protruding ends is performed with the Klenow fragment of E. coli DNA polymerase I (Biolabs) according to the supplier's specifications. The destruction of 3' protruding ends is performed in the presence of phage T4 DNA polymerase (Biolabs) used according to the manufacturer's recommendations. The destruction of 5' protruding ends is performed by a controlled treatment with S1 nuclease.

Synthetic oligodeoxynucleotide-directed in vitro mutagenesis is performed according to the method developed by Taylor et al. [Nucleic Acids Res. 13 (1985) 8749-8764] using the kit distributed by Amersham.

The enzymatic amplification of DNA fragments by the so-called PCR [Polymerase-catalyzed Chain Reaction, Saiki R. K. et al., Science 230 (1985) 1350-1354; Mullis K. B. and Faloona F. A., Meth. Enzym. 155 (1987) 335-350] is performed using a DNA thermal cycler (Perkin Elmer Cetus) according to the manufacturer's specifications.

The verification of nucleotide sequences is performed by the method developed by Sanger et al. [Proc. Natl. Acad. Sci. USA, 74 (1977) 5463-5467] using the kit distributed by Amersham.

Transformations of K. lactis with the DNA of the plasmids for the expression of the proteins of the present invention are performed by any technique known to a person skilled in the art, an example of which is given in the text.

Except where otherwise stated, the bacterial strains used are E. coli MC1060 (lacIPOZYA, X74, galU, galK, strA^(r)), or E. coli TG1 (lac, proA,B, supE, thi, hsdD5 /FtraD36, proA⁺B⁺, lacIq, lacZ, M15).

The yeast strains used belong to budding yeasts, and more especially to yeasts of the genus Kluyveromyces. The strains K. lactis MW98-8C (a, uraA, arg, lys, K⁺, pKD1°) and K. lactis CBS 293.91 were used especially; a sample of the strain MW98-8C was deposited on 16th September, 1988 at the Centraalbureau voor Schimmelkulturen (CBS) in Baarn (Netherlands), where it was registered under the number CBS 579.88.

Yeast strains transformed with the expression plasmids coding for the proteins of the present invention are cultured in Erlenmeyers or in 2 l pilot fermenters (SETRIC, France) at 28° C. in rich medium (YPD: 1% yeast extract, 2% Bactopeptone, 2% glucose; or YPL: 1% yeast extract, 2% Bactopeptone, 2% lactose) with constant stirring.
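
As a convenience, the medium recipes above can be scaled to a given culture volume. The following minimal Python sketch assumes that the percentages are weight/volume (1% = 1 g per 100 ml), which the text does not state explicitly; the function name is hypothetical.

```python
# Compositions as given in the text, in percent.
MEDIA = {
    "YPD": {"yeast extract": 1.0, "Bactopeptone": 2.0, "glucose": 2.0},
    "YPL": {"yeast extract": 1.0, "Bactopeptone": 2.0, "lactose": 2.0},
}

def grams_needed(medium, volume_litres):
    """Return grams of each component for the requested volume.

    Assumption (not stated in the text): percentages are w/v,
    so 1% corresponds to 10 g per litre.
    """
    return {
        component: percent * 10.0 * volume_litres
        for component, percent in MEDIA[medium].items()
    }

# Example: quantities for a 2-litre culture in each medium.
for medium in MEDIA:
    print(medium, grams_needed(medium, 2.0))
```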

EXAMPLE 1: CONSTRUCTION OF AN MSTII-HINDIII RESTRICTION FRAGMENT INCLUDING THE MATURE PORTION OF HUMAN G-CSF

An MstII-HindIII restriction fragment including the mature form of human G-CSF is generated, for example, according to the following strategy: a KpnI-HindIII restriction fragment is first obtained by the PCR enzymatic amplification technique using the oligodeoxynucleotides Sq2291 (5'-CAAGGATCCAAGCTTCAGGGCTGCGCAAGGTGGCGTAG-3' (SEQ ID NO: 6), the HindIII site is underlined) and Sq2292 (5'-CGGGGTACCTTAGGCTTAACCCCCCTGGGCCCTGCCAGC-3' (SEQ ID NO: 7), the KpnI site is underlined) as primers on plasmid BBG13 serving as template. Plasmid BBG13 contains the gene coding for the B form (174 amino acids) of mature human G-CSF, obtained from British Bio-technology Limited, Oxford, England. The enzymatic amplification product of approximately 550 nucleotides is then digested with the restriction enzymes KpnI and HindIII and cloned into the vector pUC19 cut with the same enzymes, thereby generating the recombinant plasmid pYG1255. This plasmid is the source of an MstII-HindIII restriction fragment, the sequence of which is included in that of FIG. 1. An MstII-HindIII restriction fragment coding for the same polypeptide sequence may also be generated by the PCR amplification technique from the corresponding cDNAs, the sequence of which is known [Nagata S. et al., EMBO J. 5 (1986) 575-581]. These cDNAs may be isolated by the techniques of a person skilled in the art, for example using the kit distributed by Amersham, from a human cell line expressing G-CSF, and for example the human carcinoma cell line CHU-2 [Nagata et al., Nature 319 (1986) 415-418].
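
Since the underlining of the restriction sites is lost in plain text, the sites carried by the two primers can be located programmatically. The following minimal Python sketch does this. The primer sequences are copied from the text; the recognition sequences (HindIII AAGCTT, KpnI GGTACC, MstII CCTNAGG, ApaI GGGCCC) are the standard ones and are an addition, not quoted from this document, and the function names are hypothetical.

```python
import re

# Standard recognition sequences (an addition; the text only names the enzymes).
SITES = {
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "MstII": "CCTNAGG",
    "ApaI": "GGGCCC",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "Sq2291": "CAAGGATCCAAGCTTCAGGGCTGCGCAAGGTGGCGTAG",
    "Sq2292": "CGGGGTACCTTAGGCTTAACCCCCCTGGGCCCTGCCAGC",
}

def locate(seq, site):
    """Return 0-based start positions of `site` in `seq` (N matches any base)."""
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    return [m.start() for m in re.finditer(pattern, seq)]

def site_map(primers, sites):
    """Map each primer name to the enzymes whose site it carries."""
    result = {}
    for primer_name, seq in primers.items():
        found = {}
        for enzyme, site in sites.items():
            positions = locate(seq, site)
            if positions:
                found[enzyme] = positions
        result[primer_name] = found
    return result

for primer_name, found in site_map(PRIMERS, SITES).items():
    print(primer_name, found)
```

On this reading, Sq2291 carries only the HindIII site, whereas Sq2292 carries the KpnI site followed by an MstII site and, further downstream, an ApaI site, which is consistent with pYG1255 serving as the source of an MstII-HindIII fragment.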

It can also be desirable to insert a peptide linker between the HSA portion and G-CSF, for example to permit a better functional presentation of the transducing portion. An MstII-HindIII restriction fragment is, for example, generated by substitution of the MstII-ApaI fragment of FIG. 1 by the oligodeoxynucleotides Sq2742 (5'-TTAGGCTTAGGTGGTGGCGGTACCCCCCTGGGCC-3' (SEQ ID NO: 8), the codons coding for the glycine residues of this particular linker are underlined) and Sq2741 (5'-CAGGGGGGTACCGCCACCACCTAAGCC-3' (SEQ ID NO: 9)), which form, on pairing, an MstII-ApaI fragment. Plasmid pYG1336 thus generated hence contains an MstII-HindIII restriction fragment, the sequence of which is identical to that of FIG. 1 except for the MstII-ApaI fragment.
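
The statement that the two oligodeoxynucleotides pair to form an MstII-ApaI fragment can be checked by computing the duplex and its single-stranded ends. The following is a minimal Python sketch with hypothetical function names; it assumes that the shorter oligonucleotide pairs entirely within the longer one, and any whitespace in the sequences is ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def anneal(top, bottom):
    """Pair two oligonucleotides, both written 5' to 3'.

    Returns (5' extension of top, duplex region, 3' extension of top).
    Assumes `bottom` is fully complementary to an internal stretch of `top`.
    """
    top = top.replace(" ", "").upper()
    target = revcomp(bottom.replace(" ", "").upper())
    start = top.find(target)
    if start < 0:
        raise ValueError("oligonucleotides are not complementary")
    return top[:start], target, top[start + len(target):]

# Sequences as given in the text (5' to 3').
SQ2742 = "TTAGGCTTAGGTGGTGGCGGTACCCCCCTGGGCC"
SQ2741 = "CAGGGGGGTACCGCCACCACCTAAGCC"

five_prime_end, duplex, three_prime_end = anneal(SQ2742, SQ2741)
print("5' extension:", five_prime_end)
print("duplex length:", len(duplex))
print("3' extension:", three_prime_end)
```

The 3-nucleotide 5' extension and the 4-nucleotide 3' extension are the kind of ends expected from MstII (cutting CC^TNAGG) and ApaI (cutting GGGCC^C) respectively; these cut positions are standard enzyme data, not taken from this document.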

EXAMPLE 2: TRANSLATIONALLY IN-FRAME FUSIONS BETWEEN HSA AND HUMAN G-CSF

E.2.1. Translational fusion of the HSA-G.CSF type

Plasmid pYG404 is described in Patent Application EP 361,991. This plasmid contains a HindIII restriction fragment coding for the prepro-HSA gene preceded by the 21 nucleotides naturally present immediately upstream of the translation initiation ATG of the PGK gene of S. cerevisiae. More especially, this fragment contains a HindIII-MstII restriction fragment corresponding to the whole of the gene coding for prepro-HSA except for the three most C-terminal amino acids (leucine-glycine-leucine residues). Ligation of this fragment with the MstII-HindIII fragment of plasmid pYG1255 makes it possible to generate the HindIII fragment of plasmid pYG1259 which codes for a chimeric protein in which the B form of mature G-CSF is positioned by genetic coupling translationally in-frame at the C-terminal end of the HSA molecule. The nucleotide sequence of this restriction fragment is given in FIG. 1, together with the polypeptide sequence of the corresponding chimera (HSA-G.CSF, see FIG. 2, diagram A).

A HindIII restriction fragment which is identical except for the MstII-ApaI fragment may also be readily generated; it codes for a chimeric protein in which the B form of mature G-CSF is positioned by genetic coupling translationally in-frame at the C-terminal end of the HSA molecule and of a particular peptide linker. For example, this linker consists of 4 glycine residues in the HindIII fragment of plasmid pYG1336 (chimera HSA-Gly₄-G.CSF, see FIG. 2, diagram A).

E.2.2. Translational fusion of the G.CSF-HSA type

In a particular embodiment, the combined techniques of directed mutagenesis and PCR amplification make it possible to construct hybrid genes coding for a chimeric protein (FIG. 2, diagram B) resulting from the translational coupling between a signal peptide (and for example the HSA prepro region), a sequence including a gene having G-CSF activity and the mature form of HSA or one of its molecular variants. These hybrid genes are preferably flanked at the 5' end of the translation initiation ATG and at the 3' end of the translation termination codon by HindIII restriction sites. For example, the oligodeoxynucleotide Sq2369 (5'-GTTCTACGCCACCTTGCGCAGCCCGGTGGAGGCGGTGATGCACACAAGAGTGAGGTTGCTCATCGG-3' (SEQ ID NO: 10), the underlined residues (optional) correspond in this particular chimera to a peptide linker composed of 4 glycine residues) enables the mature form of human G-CSF of plasmid BBG13 to be placed by directed mutagenesis immediately upstream of the mature form of HSA, thereby generating intermediate plasmid A. Similarly, the use of the oligodeoxynucleotide Sq2338 [5'-CAGGGAGCTGGCAGGGCCCAGGGGGTTCGACGAAACACACCCCTGGAATAAGCCGAGCT-3' (SEQ ID NO: 11, non-coding strand), the nucleotides complementary to the nucleotides coding for the first N-terminal residues of the mature form of human G-CSF are underlined] enables the HSA prepro region to be coupled by directed mutagenesis in the translational reading frame immediately upstream of the mature form of human G-CSF, thereby generating intermediate plasmid B. The HindIII fragment of FIG. 5 is then generated by combining the HindIII-SstI fragment of plasmid B (junction of HSA prepro region + N-terminal fragment of mature G-CSF) with the SstI-HindIII fragment of plasmid A [mature G-CSF-(glycine)_(x4)-mature HSA junction]. Plasmid pYG1301 contains this particular HindIII restriction fragment coding for the chimera G.CSF-Gly₄-HSA fused immediately downstream of the HSA prepro region.
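
The junction encoded by Sq2369 can be made explicit by translating the oligodeoxynucleotide in its first reading frame. The following minimal Python sketch uses the standard genetic code; the function name is hypothetical, and the choice of frame 0 is an assumption that is borne out by the four consecutive glycine codons it yields.

```python
from itertools import product

# Standard genetic code, laid out with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna, frame=0):
    """Translate a coding-strand DNA sequence; incomplete codons are dropped."""
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)

# Sequence as given in the text (5' to 3', coding strand).
SQ2369 = "GTTCTACGCCACCTTGCGCAGCCCGGTGGAGGCGGTGATGCACACAAGAGTGAGGTTGCTCATCGG"

peptide = translate(SQ2369)
print(peptide)
print("linker position (0-based):", peptide.find("GGGG"))
```

The translation reads as a stretch ending in proline, followed by four glycines, followed by a stretch beginning Asp-Ala-His-Lys, which is consistent with the C-terminal end of G-CSF (Pro), the Gly₄ linker and the N-terminal end of mature HSA described in the text.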

E.2.3. Translational fusion of the G.CSF-HSA-G.CSF type

These same techniques of directed mutagenesis and DNA amplification in vitro enable hybrid genes to be constructed in which a sequence coding for G-CSF activity is coupled to the N- and C-terminal ends of HSA or one of its molecular variants (FIG. 2, diagram C). These hybrid genes are preferably flanked at the 5' end of the translation initiation ATG and at the 3' end of the translation termination codon by HindIII restriction sites.

EXAMPLE 3: CONSTRUCTION OF EXPRESSION PLASMIDS

The chimeric proteins of the preceding examples may be expressed in yeasts from regulable or constitutive functional promoters such as, for example, those present in plasmids pYG105 (LAC4 promoter of Kluyveromyces lactis), pYG106 (PGK promoter of Saccharomyces cerevisiae) and pYG536 (PHO5 promoter of S. cerevisiae), or hybrid promoters such as those carried by the plasmids described in Patent Application EP 361,991.

For example, the HindIII restriction fragment of pYG1259 is cloned in the productive orientation into the HindIII restriction site of the expression plasmid pYG105, thereby generating the expression plasmid pYG1266 (FIG. 3). Plasmid pYG105 corresponds to plasmid pKan707 described in Patent Application EP 361,991 in which the HindIII restriction site has been destroyed by directed mutagenesis (oligodeoxynucleotide Sq1053: 5'-GAAATGCATAAGCTCTTGCCATTCTCACCG-3' (SEQ ID NO: 12)), and the SalI-SacI fragment of which coding for the URA3 gene has been replaced by a SalI-SacI restriction fragment containing the LAC4 promoter (in the form of a SalI-HindIII fragment) and the terminator of the PGK gene of S. cerevisiae (in the form of a HindIII-SacI fragment). Plasmid pYG105 is mitotically very stable in the absence of geneticin (G418), and enables the chimeric protein to be expressed from the LAC4 promoter of K. lactis, in particular when the carbon source is lactose. In another exemplification, cloning of the HindIII restriction fragment of plasmid pYG1259 in the productive orientation into the HindIII site of plasmid pYG106 generates the expression plasmid pYG1267. Plasmids pYG1266 and pYG1267 are isogenic with one another, except for the SalI-HindIII restriction fragment coding for the LAC4 promoter of K. lactis (plasmid pYG1266) or the PGK promoter of S. cerevisiae (plasmid pYG1267).
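The requirement for a "productive orientation" follows from the fact that the HindIII recognition sequence (AAGCTT) is its own reverse complement, so a fragment flanked by two HindIII sites can in principle ligate into a single HindIII site in either direction. The following Python sketch illustrates this under stated assumptions: the two terminal stretches are copied from the ends of SEQ ID NO: 1 as printed in the sequence listing, the interior of the fragment is deliberately omitted, and the helper names (`reverse_complement`, `is_palindromic`, `flanked_by`) are hypothetical.

```python
HINDIII = "AAGCTT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    return site == reverse_complement(site)


def flanked_by(seq: str, site: str) -> bool:
    """True if the sequence both starts and ends with the given site."""
    return seq.startswith(site) and seq.endswith(site)


# Termini of SEQ ID NO: 1 as listed; the interior is omitted here,
# so this excerpt is NOT a real fragment, only an illustration.
FIVE_PRIME_END = "AAGCTTTACAACAAATATAAAAACAATG"
THREE_PRIME_END = "CGCCACCTTGCGCAGCCCTGAAGCTT"
excerpt = FIVE_PRIME_END + THREE_PRIME_END

# The flipped fragment is still flanked by HindIII sites, so ligation
# alone does not fix the orientation of the insert.
flipped = reverse_complement(excerpt)
print(is_palindromic(HINDIII), flanked_by(excerpt, HINDIII),
      flanked_by(flipped, HINDIII))
```

Because both orientations are compatible with the vector ends, clones have to be screened (for example by restriction mapping) to retain those in which the ATG lies downstream of the promoter; the screening method itself is not specified in this passage.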

In another exemplification, cloning of the HindIII restriction fragment of plasmid pYG1336 (chimera HSA-Gly₄-G.CSF, see E.2.1.) in the productive orientation into the HindIII site of plasmids pYG105 and pYG106 generates the expression plasmids pYG1351 and pYG1352, respectively.

Likewise, cloning of the HindIII restriction fragment of plasmid pYG1301 (chimera G.CSF-Gly₄-HSA, see E.2.2.) in the productive orientation into the HindIII site of plasmids pYG105 and pYG106 generates the expression plasmids pYG1302 and pYG1303, respectively.

EXAMPLE 4: TRANSFORMATION OF YEASTS

Transformation of yeasts belonging to the genus Kluyveromyces, and especially K. lactis strains MW98-8C and CBS 293.91, is performed, for example, by the technique of treatment of whole cells with lithium acetate [Ito H. et al., J. Bacteriol. 153 (1983) 163-168], adapted as follows. Cell growth takes place at 28° C. in 50 ml of YPD medium, with stirring and to an optical density at 600 nm (OD₆₀₀) of between 0.6 and 0.8; the cells are harvested by low speed centrifugation, washed in sterile TE solution (10 mM Tris-HCl pH 7.4; 1 mM EDTA), resuspended in 3-4 ml of lithium acetate (0.1M in TE) to obtain a cell density of approximately 2×10⁸ cells/ml, and then incubated at 30° C. for 1 hour with moderate stirring. 0.1 ml aliquots of the resulting suspension of competent cells are incubated at 30° C. for 1 hour in the presence of DNA and of polyethylene glycol (PEG₄₀₀₀, Sigma) at a final concentration of 35%. After a 5-minute thermal shock at 42° C., the cells are washed twice, resuspended in 0.2 ml of sterile water and incubated for 16 hours at 28° C. in 2 ml of YPD medium to permit the phenotypic expression of the ORF1-APH fusion expressed under the control of the Pₖ₁ promoter; 200 μl of the cell suspension are then plated out on selective YPD dishes (G418, 200 μg/ml). The dishes are incubated at 28° C. and the transformants appear after 2 to 3 days of cell growth.

EXAMPLE 5: SECRETION OF CHIMERAS

After selection on rich medium supplemented with G418, the recombinant clones are tested for their capacity to secrete the mature form of the proteins which are chimeras between HSA and G-CSF. A few clones corresponding to K. lactis strain CBS 293.91 transformed with plasmids pYG1266 or pYG1267 (HSA-G.CSF), pYG1302 or pYG1303 (G.CSF-Gly₄-HSA) or alternatively pYG1351 or pYG1352 (HSA-Gly₄-G.CSF) are incubated in selective complete liquid medium at 28° C. The cell supernatants are then tested after electrophoresis on 8.5% acrylamide gel, either directly by staining the acrylamide gel with Coomassie blue (FIG. 4, diagram A), or after immunoblotting using as primary antibodies rabbit polyclonal antibodies directed specifically against human G-CSF or against HSA. In the immunological detection experiments, the nitrocellulose filter is first incubated in the presence of the specific antibody, washed several times, incubated in the presence of biotinylated goat anti-rabbit antibodies and then incubated in the presence of an avidin-peroxidase complex using the "ABC kit" distributed by Vectastain (Biosys S. A., Compiegne, France). The immunological reaction is then visualized by adding 3,3'-diaminobenzidine tetrahydrochloride (Prolabo) in the presence of hydrogen peroxide, according to the supplier's recommendations. The results in FIG. 4 demonstrate that the HSA-G.CSF hybrid protein is recognized both by antibodies directed against human albumin (diagram C) and by antibodies directed against human G-CSF (diagram B). The results in FIG. 6 indicate that the chimera HSA-Gly₄-G.CSF (lane 3) is especially well secreted by Kluyveromyces yeast, possibly because the presence of the peptide linker between the HSA portion and the G-CSF portion is more favourable to an independent folding of these portions on transit of the chimera into the secretory pathway. Furthermore, the N-terminal fusion (G.CSF-Gly₄-HSA) is also secreted by Kluyveromyces yeast (FIG. 6, lane 1).

EXAMPLE 6: PURIFICATION AND MOLECULAR CHARACTERIZATION OF THE SECRETED PRODUCTS

After centrifugation of a culture of the strain CBS 293.91 transformed with the expression plasmids according to Example 3, the culture supernatant is passed through a 0.22 μm filter (Millipore) and then concentrated by ultrafiltration (Amicon) using a membrane whose discrimination threshold lies at 30 kDa. The concentrate obtained is then adjusted to 50 mM Tris-HCl from a 1M stock solution of Tris-HCl (pH 6), and thereafter applied in 20-ml fractions to an ion exchange column (5 ml) (Q Fast Flow, Pharmacia) equilibrated in the same buffer. The chimeric protein is then eluted from the column with an NaCl gradient (0 to 1M). The fractions containing the chimeric protein are then pooled and dialysed against a 50 mM Tris-HCl solution (pH 6) and reapplied to a Q Fast Flow column (1 ml) equilibrated in the same buffer. After elution from the column, the fractions containing the protein are pooled, dialysed against water and lyophilized before characterization: for example, sequencing (Applied Biosystems) of the HSA-G.CSF protein secreted by the yeast CBS 293.91 gives the expected N-terminal sequence of HSA (Asp-Ala-His . . . ), demonstrating a correct maturation of the chimera on the immediately C-terminal side of the doublet of Arg-Arg residues of the HSA "pro" region (FIG. 1).
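The buffer adjustment described above is a simple dilution calculation. The Python sketch below is offered only as a worked illustration; it assumes that the concentrate contains no Tris before the adjustment and that volumes are additive, and the function name `stock_volume_to_add` and the 20 ml sample volume used in the example are hypothetical choices.

```python
def stock_volume_to_add(sample_ml: float, stock_mM: float, target_mM: float) -> float:
    """Volume of stock (ml) to add to a buffer-free sample to reach target_mM.

    Solves stock_mM * v = target_mM * (sample_ml + v) for v,
    assuming additive volumes and no Tris in the starting sample.
    """
    if target_mM >= stock_mM:
        raise ValueError("target concentration must be below the stock concentration")
    return sample_ml * target_mM / (stock_mM - target_mM)


# Example: bringing 20 ml of concentrate to 50 mM with a 1 M (1000 mM) stock.
volume = stock_volume_to_add(20.0, 1000.0, 50.0)
print(f"{volume:.2f} ml of 1 M Tris-HCl")  # 1.05 ml of 1 M Tris-HCl
```

In other words, under these assumptions one part of 1M stock is added per 19 parts of concentrate.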

EXAMPLE 7: BIOLOGICAL ACTIVITY OF THE CHIMERAS BETWEEN HSA AND G-CSF

E.7.1. In vitro biological activity

The chimeras purified according to Example 6 are tested for their capacity to permit in vitro proliferation of the IL3-dependent murine line NFS60, by measuring the incorporation of tritiated thymidine essentially according to the protocol described by Tsuchiya et al. [Proc. Natl. Acad. Sci. (1986) 83 7633]. For each chimera, measurements are carried out between 3 and 6 times in a three-point test (three dilutions of the product) in a region where the relationship between amount of active product and incorporation of labelled thymidine (Amersham) is linear. In each microtitration plate, a reference product consisting of recombinant human G-CSF expressed in mammalian cells is also systematically included and its activity measured. The results in FIG. 7 demonstrate that the chimera HSA-G.CSF (pYG1266) secreted by Kluyveromyces yeast is capable in vitro of transducing a cell proliferation signal for the NFS60 line. In this particular case, the specific activity (cpm/molarity) of the chimera is approximately one seventh that of the reference G-CSF (not coupled).

E.7.2. In vivo activity

The stimulatory activity of the HSA/G-CSF chimeras with respect to in vivo granulopoiesis is tested after subcutaneous injection in rats (Sprague-Dawley/CD, 250-300 g, 8-9 weeks) and compared with that of the reference G-CSF expressed from mammalian cells. Each product, tested on 7 animals, is injected subcutaneously into the dorsoscapular region in a volume of 100 μl over 7 consecutive days (D1-D7). 500 μl of blood are collected on days D-6, D2 (before the 2nd injection), D5 (before the 5th injection) and D8, and a blood count is performed. In this test, the specific activity (neutropoiesis units/mole injected) of the chimera HSA-G.CSF (pYG1266) is identical to that of the reference G-CSF (FIG. 8). Since this particular chimera possesses in vitro a specific activity one seventh that of the reference G-CSF (FIG. 7), it is hence demonstrated that the genetic coupling of G-CSF to HSA favourably modifies its pharmacokinetic properties.
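The reasoning behind this conclusion can be made explicit with a short calculation. The Python sketch below is illustrative only: it takes the two relative specific activities stated in the text (approximately one seventh in vitro, identical in vivo) as exact values, and the function name `in_vivo_gain` is a hypothetical label for the ratio between them, not a pharmacokinetic parameter measured in the experiments.

```python
from fractions import Fraction


def in_vivo_gain(in_vitro_relative: Fraction, in_vivo_relative: Fraction) -> Fraction:
    """Ratio of in vivo to in vitro specific activity, both relative to reference G-CSF."""
    return in_vivo_relative / in_vitro_relative


# Values for the chimera HSA-G.CSF (pYG1266), relative to the reference G-CSF.
in_vitro = Fraction(1, 7)  # "approximately one seventh" (FIG. 7), treated as exact
in_vivo = Fraction(1, 1)   # "identical" (FIG. 8)

print(in_vivo_gain(in_vitro, in_vivo))  # 7
```

On a molar basis the chimera therefore performs about seven times better in vivo than its in vitro potency alone would predict, which is the observation attributed in the text to more favourable pharmacokinetic properties.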

    __________________________________________________________________________     SEQUENCE LISTING                                                               (1) GENERAL INFORMATION:                                                       (iii) NUMBER OF SEQUENCES: 12                                                  (2) INFORMATION FOR SEQ ID NO:1:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 2382 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: cDNA                                                       (iii) HYPOTHETICAL: NO                                                         (iv) ANTI-SENSE: NO                                                            (ix) FEATURE:                                                                  (A) NAME/KEY: CDS                                                              (B) LOCATION: 26..2377                                                         (ix) FEATURE:                                                                  (A) NAME/KEY: misc.sub.-- feature                                              (B) LOCATION: 1842..1848                                                       (D) OTHER INFORMATION: /label=MstII-site                                       (ix) FEATURE:                                                                  (A) NAME/KEY: misc.sub.-- feature                                              (B) LOCATION: 1861..1866                                                       (D) OTHER INFORMATION: /label=ApaI-site                                        (ix) FEATURE:                                                                  (A) NAME/KEY: 
misc.sub.-- feature                                              (B) LOCATION: 2035..2040                                                       (D) OTHER INFORMATION: /label=SstI-site                                        (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                        AAGCTTTACAACAAATATAAAAACAATGAAGTGGGTAACCTTTATTTCCCTT52                         MetLysTrpValThrPheIleSerLeu                                                    15                                                                             CTTTTTCTCTTTAGCTCGGCTTATTCCAGGGGTGTGTTTCGTCGAGAT100                            LeuPheLeuPheSerSerAlaTyrSerArgGlyValPheArgArgAsp                               10152025                                                                       GCACACAAGAGTGAGGTTGCTCATCGGTTTAAAGATTTGGGAGAAGAA148                            AlaHisLysSerGluValAlaHisArgPheLysAspLeuGlyGluGlu                               303540                                                                         AATTTCAAAGCCTTGGTGTTGATTGCCTTTGCTCAGTATCTTCAGCAG196                            AsnPheLysAlaLeuValLeuIleAlaPheAlaGlnTyrLeuGlnGln                               455055                                                                         TGTCCATTTGAAGATCATGTAAAATTAGTGAATGAAGTAACTGAATTT244                            CysProPheGluAspHisValLysLeuValAsnGluValThrGluPhe                               606570                                                                         GCAAAAACATGTGTTGCTGATGAGTCAGCTGAAAATTGTGACAAATCA292                            AlaLysThrCysValAlaAspGluSerAlaGluAsnCysAspLysSer                               758085                                                                         CTTCATACCCTTTTTGGAGACAAATTATGCACAGTTGCAACTCTTCGT340                            LeuHisThrLeuPheGlyAspLysLeuCysThrValAlaThrLeuArg                               9095100105                                                                     
GAAACCTATGGTGAAATGGCTGACTGCTGTGCAAAACAAGAACCTGAG388                            GluThrTyrGlyGluMetAlaAspCysCysAlaLysGlnGluProGlu                               110115120                                                                      AGAAATGAATGCTTCTTGCAACACAAAGATGACAACCCAAACCTCCCC436                            ArgAsnGluCysPheLeuGlnHisLysAspAspAsnProAsnLeuPro                               125130135                                                                      CGATTGGTGAGACCAGAGGTTGATGTGATGTGCACTGCTTTTCATGAC484                            ArgLeuValArgProGluValAspValMetCysThrAlaPheHisAsp                               140145150                                                                      AATGAAGAGACATTTTTGAAAAAATACTTATATGAAATTGCCAGAAGA532                            AsnGluGluThrPheLeuLysLysTyrLeuTyrGluIleAlaArgArg                               155160165                                                                      CATCCTTACTTTTATGCCCCGGAACTCCTTTTCTTTGCTAAAAGGTAT580                            HisProTyrPheTyrAlaProGluLeuLeuPhePheAlaLysArgTyr                               170175180185                                                                   AAAGCTGCTTTTACAGAATGTTGCCAAGCTGCTGATAAAGCTGCCTGC628                            LysAlaAlaPheThrGluCysCysGlnAlaAlaAspLysAlaAlaCys                               190195200                                                                      CTGTTGCCAAAGCTCGATGAACTTCGGGATGAAGGGAAGGCTTCGTCT676                            LeuLeuProLysLeuAspGluLeuArgAspGluGlyLysAlaSerSer                               205210215                                                                      GCCAAACAGAGACTCAAGTGTGCCAGTCTCCAAAAATTTGGAGAAAGA724                            AlaLysGlnArgLeuLysCysAlaSerLeuGlnLysPheGlyGluArg                               220225230                                                                      GCTTTCAAAGCATGGGCAGTAGCTCGCCTGAGCCAGAGATTTCCCAAA772                            
AlaPheLysAlaTrpAlaValAlaArgLeuSerGlnArgPheProLys                               235240245                                                                      GCTGAGTTTGCAGAAGTTTCCAAGTTAGTGACAGATCTTACCAAAGTC820                            AlaGluPheAlaGluValSerLysLeuValThrAspLeuThrLysVal                               250255260265                                                                   CACACGGAATGCTGCCATGGAGATCTGCTTGAATGTGCTGATGACAGG868                            HisThrGluCysCysHisGlyAspLeuLeuGluCysAlaAspAspArg                               270275280                                                                      GCGGACCTTGCCAAGTATATCTGTGAAAATCAAGATTCGATCTCCAGT916                            AlaAspLeuAlaLysTyrIleCysGluAsnGlnAspSerIleSerSer                               285290295                                                                      AAACTGAAGGAATGCTGTGAAAAACCTCTGTTGGAAAAATCCCACTGC964                            LysLeuLysGluCysCysGluLysProLeuLeuGluLysSerHisCys                               300305310                                                                      ATTGCCGAAGTGGAAAATGATGAGATGCCTGCTGACTTGCCTTCATTA1012                           IleAlaGluValGluAsnAspGluMetProAlaAspLeuProSerLeu                               315320325                                                                      GCTGCTGATTTTGTTGAAAGTAAGGATGTTTGCAAAAACTATGCTGAG1060                           AlaAlaAspPheValGluSerLysAspValCysLysAsnTyrAlaGlu                               330335340345                                                                   GCAAAGGATGTCTTCCTGGGCATGTTTTTGTATGAATATGCAAGAAGG1108                           AlaLysAspValPheLeuGlyMetPheLeuTyrGluTyrAlaArgArg                               350355360                                                                      CATCCTGATTACTCTGTCGTACTGCTGCTGAGACTTGCCAAGACATAT1156                           HisProAspTyrSerValValLeuLeuLeuArgLeuAlaLysThrTyr                               365370375                
                                                      GAAACCACTCTAGAGAAGTGCTGTGCCGCTGCAGATCCTCATGAATGC1204                           GluThrThrLeuGluLysCysCysAlaAlaAlaAspProHisGluCys                               380385390                                                                      TATGCCAAAGTGTTCGATGAATTTAAACCTCTTGTGGAAGAGCCTCAG1252                           TyrAlaLysValPheAspGluPheLysProLeuValGluGluProGln                               395400405                                                                      AATTTAATCAAACAAAATTGTGAGCTTTTTGAGCAGCTTGGAGAGTAC1300                           AsnLeuIleLysGlnAsnCysGluLeuPheGluGlnLeuGlyGluTyr                               410415420425                                                                   AAATTCCAGAATGCGCTATTAGTTCGTTACACCAAGAAAGTACCCCAA1348                           LysPheGlnAsnAlaLeuLeuValArgTyrThrLysLysValProGln                               430435440                                                                      GTGTCAACTCCAACTCTTGTAGAGGTCTCAAGAAACCTAGGAAAAGTG1396                           ValSerThrProThrLeuValGluValSerArgAsnLeuGlyLysVal                               445450455                                                                      GGCAGCAAATGTTGTAAACATCCTGAAGCAAAAAGAATGCCCTGTGCA1444                           GlySerLysCysCysLysHisProGluAlaLysArgMetProCysAla                               460465470                                                                      GAAGACTATCTATCCGTGGTCCTGAACCAGTTATGTGTGTTGCATGAG1492                           GluAspTyrLeuSerValValLeuAsnGlnLeuCysValLeuHisGlu                               475480485                                                                      AAAACGCCAGTAAGTGACAGAGTCACCAAATGCTGCACAGAATCCTTG1540                           LysThrProValSerAspArgValThrLysCysCysThrGluSerLeu                               490495500505                                                                   
GTGAACAGGCGACCATGCTTTTCAGCTCTGGAAGTCGATGAAACATAC1588                           ValAsnArgArgProCysPheSerAlaLeuGluValAspGluThrTyr                               510515520                                                                      GTTCCCAAAGAGTTTAATGCTGAAACATTCACCTTCCATGCAGATATA1636                           ValProLysGluPheAsnAlaGluThrPheThrPheHisAlaAspIle                               525530535                                                                      TGCACACTTTCTGAGAAGGAGAGACAAATCAAGAAACAAACTGCACTT1684                           CysThrLeuSerGluLysGluArgGlnIleLysLysGlnThrAlaLeu                               540545550                                                                      GTTGAGCTTGTGAAACACAAGCCCAAGGCAACAAAAGAGCAACTGAAA1732                           ValGluLeuValLysHisLysProLysAlaThrLysGluGlnLeuLys                               555560565                                                                      GCTGTTATGGATGATTTCGCAGCTTTTGTAGAGAAGTGCTGCAAGGCT1780                           AlaValMetAspAspPheAlaAlaPheValGluLysCysCysLysAla                               570575580585                                                                   GACGATAAGGAGACCTGCTTTGCCGAGGAGGGTAAAAAACTTGTTGCT1828                           AspAspLysGluThrCysPheAlaGluGluGlyLysLysLeuValAla                               590595600                                                                      GCAAGTCAAGCTGCCTTAGGCTTAACCCCCCTGGGCCCTGCCAGCTCC1876                           AlaSerGlnAlaAlaLeuGlyLeuThrProLeuGlyProAlaSerSer                               605610615                                                                      CTGCCCCAGAGCTTCCTGCTCAAGTGCTTAGAGCAAGTGAGGAAGATC1924                           LeuProGlnSerPheLeuLeuLysCysLeuGluGlnValArgLysIle                               620625630                                                                      CAGGGCGATGGCGCAGCGCTCCAGGAGAAGCTGTGTGCCACCTACAAG1972                           
GlnGlyAspGlyAlaAlaLeuGlnGluLysLeuCysAlaThrTyrLys                               635640645                                                                      CTGTGCCACCCCGAGGAGCTGGTGCTGCTCGGACACTCTCTGGGCATC2020                           LeuCysHisProGluGluLeuValLeuLeuGlyHisSerLeuGlyIle                               650655660665                                                                   CCCTGGGCTCCCCTGAGCTCCTGCCCCAGCCAGGCCCTGCAGCTGGCA2068                           ProTrpAlaProLeuSerSerCysProSerGlnAlaLeuGlnLeuAla                               670675680                                                                      GGCTGCTTGAGCCAACTCCATAGCGGCCTTTTCCTCTACCAGGGGCTC2116                           GlyCysLeuSerGlnLeuHisSerGlyLeuPheLeuTyrGlnGlyLeu                               685690695                                                                      CTGCAGGCCCTGGAAGGGATATCCCCCGAGTTGGGTCCCACCTTGGAC2164                           LeuGlnAlaLeuGluGlyIleSerProGluLeuGlyProThrLeuAsp                               700705710                                                                      ACACTGCAGCTGGACGTCGCCGACTTTGCCACCACCATCTGGCAGCAG2212                           ThrLeuGlnLeuAspValAlaAspPheAlaThrThrIleTrpGlnGln                               715720725                                                                      ATGGAAGAACTGGGAATGGCCCCTGCCCTGCAGCCCACCCAGGGTGCC2260                           MetGluGluLeuGlyMetAlaProAlaLeuGlnProThrGlnGlyAla                               730735740745                                                                   ATGCCGGCCTTCGCCTCTGCTTTCCAGCGCCGGGCAGGAGGGGTCCTG2308                           MetProAlaPheAlaSerAlaPheGlnArgArgAlaGlyGlyValLeu                               750755760                                                                      GTTGCTAGCCATCTGCAGAGCTTCCTGGAGGTGTCGTACCGCGTTCTA2356                           ValAlaSerHisLeuGlnSerPheLeuGluValSerTyrArgValLeu                               765770775                
                                                      CGCCACCTTGCGCAGCCCTGAAGCTT2382                                                 ArgHisLeuAlaGlnPro                                                             780                                                                            (2) INFORMATION FOR SEQ ID NO:2:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 783 amino acids                                                    (B) TYPE: amino acid                                                           (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: protein                                                    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                        MetLysTrpValThrPheIleSerLeuLeuPheLeuPheSerSerAla                               151015                                                                         TyrSerArgGlyValPheArgArgAspAlaHisLysSerGluValAla                               202530                                                                         HisArgPheLysAspLeuGlyGluGluAsnPheLysAlaLeuValLeu                               354045                                                                         IleAlaPheAlaGlnTyrLeuGlnGlnCysProPheGluAspHisVal                               505560                                                                         LysLeuValAsnGluValThrGluPheAlaLysThrCysValAlaAsp                               65707580                                                                       GluSerAlaGluAsnCysAspLysSerLeuHisThrLeuPheGlyAsp                               859095                                                                         LysLeuCysThrValAlaThrLeuArgGluThrTyrGlyGluMetAla                               100105110                                                                      AspCysCysAlaLysGlnGluProGluArgAsnGluCysPheLeuGln  
                             115120125                                                                      HisLysAspAspAsnProAsnLeuProArgLeuValArgProGluVal                               130135140                                                                      AspValMetCysThrAlaPheHisAspAsnGluGluThrPheLeuLys                               145150155160                                                                   LysTyrLeuTyrGluIleAlaArgArgHisProTyrPheTyrAlaPro                               165170175                                                                      GluLeuLeuPhePheAlaLysArgTyrLysAlaAlaPheThrGluCys                               180185190                                                                      CysGlnAlaAlaAspLysAlaAlaCysLeuLeuProLysLeuAspGlu                               195200205                                                                      LeuArgAspGluGlyLysAlaSerSerAlaLysGlnArgLeuLysCys                               210215220                                                                      AlaSerLeuGlnLysPheGlyGluArgAlaPheLysAlaTrpAlaVal                               225230235240                                                                   AlaArgLeuSerGlnArgPheProLysAlaGluPheAlaGluValSer                               245250255                                                                      LysLeuValThrAspLeuThrLysValHisThrGluCysCysHisGly                               260265270                                                                      AspLeuLeuGluCysAlaAspAspArgAlaAspLeuAlaLysTyrIle                               275280285                                                                      CysGluAsnGlnAspSerIleSerSerLysLeuLysGluCysCysGlu                               290295300                                                                      LysProLeuLeuGluLysSerHisCysIleAlaGluValGluAsnAsp                               305310315320                                                               
    GluMetProAlaAspLeuProSerLeuAlaAlaAspPheValGluSer                               325330335                                                                      LysAspValCysLysAsnTyrAlaGluAlaLysAspValPheLeuGly                               340345350                                                                      MetPheLeuTyrGluTyrAlaArgArgHisProAspTyrSerValVal                               355360365                                                                      LeuLeuLeuArgLeuAlaLysThrTyrGluThrThrLeuGluLysCys                               370375380                                                                      CysAlaAlaAlaAspProHisGluCysTyrAlaLysValPheAspGlu                               385390395400                                                                   PheLysProLeuValGluGluProGlnAsnLeuIleLysGlnAsnCys                               405410415                                                                      GluLeuPheGluGlnLeuGlyGluTyrLysPheGlnAsnAlaLeuLeu                               420425430                                                                      ValArgTyrThrLysLysValProGlnValSerThrProThrLeuVal                               435440445                                                                      GluValSerArgAsnLeuGlyLysValGlySerLysCysCysLysHis                               450455460                                                                      ProGluAlaLysArgMetProCysAlaGluAspTyrLeuSerValVal                               465470475480                                                                   LeuAsnGlnLeuCysValLeuHisGluLysThrProValSerAspArg                               485490495                                                                      ValThrLysCysCysThrGluSerLeuValAsnArgArgProCysPhe                               500505510                                                                      SerAlaLeuGluValAspGluThrTyrValProLysGluPheAsnAla                               515520525            
                                                          GluThrPheThrPheHisAlaAspIleCysThrLeuSerGluLysGlu                               530535540                                                                      ArgGlnIleLysLysGlnThrAlaLeuValGluLeuValLysHisLys                               545550555560                                                                   ProLysAlaThrLysGluGlnLeuLysAlaValMetAspAspPheAla                               565570575                                                                      AlaPheValGluLysCysCysLysAlaAspAspLysGluThrCysPhe                               580585590                                                                      AlaGluGluGlyLysLysLeuValAlaAlaSerGlnAlaAlaLeuGly                               595600605                                                                      LeuThrProLeuGlyProAlaSerSerLeuProGlnSerPheLeuLeu                               610615620                                                                      LysCysLeuGluGlnValArgLysIleGlnGlyAspGlyAlaAlaLeu                               625630635640                                                                   GlnGluLysLeuCysAlaThrTyrLysLeuCysHisProGluGluLeu                               645650655                                                                      ValLeuLeuGlyHisSerLeuGlyIleProTrpAlaProLeuSerSer                               660665670                                                                      CysProSerGlnAlaLeuGlnLeuAlaGlyCysLeuSerGlnLeuHis                               675680685                                                                      SerGlyLeuPheLeuTyrGlnGlyLeuLeuGlnAlaLeuGluGlyIle                               690695700                                                                      SerProGluLeuGlyProThrLeuAspThrLeuGlnLeuAspValAla                               705710715720                                                                   
AspPheAlaThrThrIleTrpGlnGlnMetGluGluLeuGlyMetAla
725 730 735
ProAlaLeuGlnProThrGlnGlyAlaMetProAlaPheAlaSerAla
740 745 750
PheGlnArgArgAlaGlyGlyValLeuValAlaSerHisLeuGlnSer
755 760 765
PheLeuGluValSerTyrArgValLeuArgHisLeuAlaGlnPro
770 775 780
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2455 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 26..2389
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 106..111
(D) OTHER INFORMATION: /label=ApaI-site
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 280..285
(D) OTHER INFORMATION: /label=SstI-site
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2376..2382
(D) OTHER INFORMATION: /label=MstII-site
(ix) FEATURE:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 26..97
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 620..631
(D) OTHER INFORMATION: /label=polyGly-linker
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
AAGCTTTACAACAAATATAAAAACAATGAAGTGGGTAACCTTTATTTCCCTT 52
MetLysTrpValThrPheIleSerLeu
1 5
CTTTTTCTCTTTAGCTCGGCTTATTCCAGGGGTGTGTTTCGTCGAACC 100
LeuPheLeuPheSerSerAlaTyrSerArgGlyValPheArgArgThr
10 15 20 25
CCCCTGGGCCCTGCCAGCTCCCTGCCCCAGAGCTTCCTGCTCAAGTGC 148
ProLeuGlyProAlaSerSerLeuProGlnSerPheLeuLeuLysCys
30 35 40
TTAGAGCAAGTGAGGAAGATCCAGGGCGATGGCGCAGCGCTCCAGGAG 196
LeuGluGlnValArgLysIleGlnGlyAspGlyAlaAlaLeuGlnGlu
45 50 55
AAGCTGTGTGCCACCTACAAGCTGTGCCACCCCGAGGAGCTGGTGCTG 244
LysLeuCysAlaThrTyrLysLeuCysHisProGluGluLeuValLeu
60 65 70
CTCGGACACTCTCTGGGCATCCCCTGGGCTCCCCTGAGCTCCTGCCCC 292
LeuGlyHisSerLeuGlyIleProTrpAlaProLeuSerSerCysPro
75 80 85
AGCCAGGCCCTGCAGCTGGCAGGCTGCTTGAGCCAACTCCATAGCGGC 340
SerGlnAlaLeuGlnLeuAlaGlyCysLeuSerGlnLeuHisSerGly
90 95 100 105
CTTTTCCTCTACCAGGGGCTCCTGCAGGCCCTGGAAGGGATATCCCCC 388
LeuPheLeuTyrGlnGlyLeuLeuGlnAlaLeuGluGlyIleSerPro
110 115 120
GAGTTGGGTCCCACCTTGGACACACTGCAGCTGGACGTCGCCGACTTT 436
GluLeuGlyProThrLeuAspThrLeuGlnLeuAspValAlaAspPhe
125 130 135
GCCACCACCATCTGGCAGCAGATGGAAGAACTGGGAATGGCCCCTGCC 484
AlaThrThrIleTrpGlnGlnMetGluGluLeuGlyMetAlaProAla
140 145 150
CTGCAGCCCACCCAGGGTGCCATGCCGGCCTTCGCCTCTGCTTTCCAG 532
LeuGlnProThrGlnGlyAlaMetProAlaPheAlaSerAlaPheGln
155 160 165
CGCCGGGCAGGAGGGGTCCTGGTTGCTAGCCATCTGCAGAGCTTCCTG 580
ArgArgAlaGlyGlyValLeuValAlaSerHisLeuGlnSerPheLeu
170 175 180 185
GAGGTGTCGTACCGCGTTCTACGCCACCTTGCGCAGCCCGGTGGAGGC 628
GluValSerTyrArgValLeuArgHisLeuAlaGlnProGlyGlyGly
190 195 200
GGTGATGCACACAAGAGTGAGGTTGCTCATCGGTTTAAAGATTTGGGA 676
GlyAspAlaHisLysSerGluValAlaHisArgPheLysAspLeuGly
205 210 215
GAAGAAAATTTCAAAGCCTTGGTGTTGATTGCCTTTGCTCAGTATCTT 724
GluGluAsnPheLysAlaLeuValLeuIleAlaPheAlaGlnTyrLeu
220 225 230
CAGCAGTGTCCATTTGAAGATCATGTAAAATTAGTGAATGAAGTAACT 772
GlnGlnCysProPheGluAspHisValLysLeuValAsnGluValThr
235 240 245
GAATTTGCAAAAACATGTGTTGCTGATGAGTCAGCTGAAAATTGTGAC 820
GluPheAlaLysThrCysValAlaAspGluSerAlaGluAsnCysAsp
250 255 260 265
AAATCACTTCATACCCTTTTTGGAGACAAATTATGCACAGTTGCAACT 868
LysSerLeuHisThrLeuPheGlyAspLysLeuCysThrValAlaThr
270 275 280
CTTCGTGAAACCTATGGTGAAATGGCTGACTGCTGTGCAAAACAAGAA 916
LeuArgGluThrTyrGlyGluMetAlaAspCysCysAlaLysGlnGlu
285 290 295
CCTGAGAGAAATGAATGCTTCTTGCAACACAAAGATGACAACCCAAAC 964
ProGluArgAsnGluCysPheLeuGlnHisLysAspAspAsnProAsn
300 305 310
CTCCCCCGATTGGTGAGACCAGAGGTTGATGTGATGTGCACTGCTTTT 1012
LeuProArgLeuValArgProGluValAspValMetCysThrAlaPhe
315 320 325
CATGACAATGAAGAGACATTTTTGAAAAAATACTTATATGAAATTGCC 1060
HisAspAsnGluGluThrPheLeuLysLysTyrLeuTyrGluIleAla
330 335 340 345
AGAAGACATCCTTACTTTTATGCCCCGGAACTCCTTTTCTTTGCTAAA 1108
ArgArgHisProTyrPheTyrAlaProGluLeuLeuPhePheAlaLys
350 355 360
AGGTATAAAGCTGCTTTTACAGAATGTTGCCAAGCTGCTGATAAAGCT 1156
ArgTyrLysAlaAlaPheThrGluCysCysGlnAlaAlaAspLysAla
365 370 375
GCCTGCCTGTTGCCAAAGCTCGATGAACTTCGGGATGAAGGGAAGGCT 1204
AlaCysLeuLeuProLysLeuAspGluLeuArgAspGluGlyLysAla
380 385 390
TCGTCTGCCAAACAGAGACTCAAGTGTGCCAGTCTCCAAAAATTTGGA 1252
SerSerAlaLysGlnArgLeuLysCysAlaSerLeuGlnLysPheGly
395 400 405
GAAAGAGCTTTCAAAGCATGGGCAGTAGCTCGCCTGAGCCAGAGATTT 1300
GluArgAlaPheLysAlaTrpAlaValAlaArgLeuSerGlnArgPhe
410 415 420 425
CCCAAAGCTGAGTTTGCAGAAGTTTCCAAGTTAGTGACAGATCTTACC 1348
ProLysAlaGluPheAlaGluValSerLysLeuValThrAspLeuThr
430 435 440
AAAGTCCACACGGAATGCTGCCATGGAGATCTGCTTGAATGTGCTGAT 1396
LysValHisThrGluCysCysHisGlyAspLeuLeuGluCysAlaAsp
445 450 455
GACAGGGCGGACCTTGCCAAGTATATCTGTGAAAATCAAGATTCGATC 1444
AspArgAlaAspLeuAlaLysTyrIleCysGluAsnGlnAspSerIle
460 465 470
TCCAGTAAACTGAAGGAATGCTGTGAAAAACCTCTGTTGGAAAAATCC 1492
SerSerLysLeuLysGluCysCysGluLysProLeuLeuGluLysSer
475 480 485
CACTGCATTGCCGAAGTGGAAAATGATGAGATGCCTGCTGACTTGCCT 1540
HisCysIleAlaGluValGluAsnAspGluMetProAlaAspLeuPro
490 495 500 505
TCATTAGCTGCTGATTTTGTTGAAAGTAAGGATGTTTGCAAAAACTAT 1588
SerLeuAlaAlaAspPheValGluSerLysAspValCysLysAsnTyr
510 515 520
GCTGAGGCAAAGGATGTCTTCCTGGGCATGTTTTTGTATGAATATGCA 1636
AlaGluAlaLysAspValPheLeuGlyMetPheLeuTyrGluTyrAla
525 530 535
AGAAGGCATCCTGATTACTCTGTCGTACTGCTGCTGAGACTTGCCAAG 1684
ArgArgHisProAspTyrSerValValLeuLeuLeuArgLeuAlaLys
540 545 550
ACATATGAAACCACTCTAGAGAAGTGCTGTGCCGCTGCAGATCCTCAT 1732
ThrTyrGluThrThrLeuGluLysCysCysAlaAlaAlaAspProHis
555 560 565
GAATGCTATGCCAAAGTGTTCGATGAATTTAAACCTCTTGTGGAAGAG 1780
GluCysTyrAlaLysValPheAspGluPheLysProLeuValGluGlu
570 575 580 585
CCTCAGAATTTAATCAAACAAAATTGTGAGCTTTTTGAGCAGCTTGGA 1828
ProGlnAsnLeuIleLysGlnAsnCysGluLeuPheGluGlnLeuGly
590 595 600
GAGTACAAATTCCAGAATGCGCTATTAGTTCGTTACACCAAGAAAGTA 1876
GluTyrLysPheGlnAsnAlaLeuLeuValArgTyrThrLysLysVal
605 610 615
CCCCAAGTGTCAACTCCAACTCTTGTAGAGGTCTCAAGAAACCTAGGA 1924
ProGlnValSerThrProThrLeuValGluValSerArgAsnLeuGly
620 625 630
AAAGTGGGCAGCAAATGTTGTAAACATCCTGAAGCAAAAAGAATGCCC 1972
LysValGlySerLysCysCysLysHisProGluAlaLysArgMetPro
635 640 645
TGTGCAGAAGACTATCTATCCGTGGTCCTGAACCAGTTATGTGTGTTG 2020
CysAlaGluAspTyrLeuSerValValLeuAsnGlnLeuCysValLeu
650 655 660 665
CATGAGAAAACGCCAGTAAGTGACAGAGTCACCAAATGCTGCACAGAA 2068
HisGluLysThrProValSerAspArgValThrLysCysCysThrGlu
670 675 680
TCCTTGGTGAACAGGCGACCATGCTTTTCAGCTCTGGAAGTCGATGAA 2116
SerLeuValAsnArgArgProCysPheSerAlaLeuGluValAspGlu
685 690 695
ACATACGTTCCCAAAGAGTTTAATGCTGAAACATTCACCTTCCATGCA 2164
ThrTyrValProLysGluPheAsnAlaGluThrPheThrPheHisAla
700 705 710
GATATATGCACACTTTCTGAGAAGGAGAGACAAATCAAGAAACAAACT 2212
AspIleCysThrLeuSerGluLysGluArgGlnIleLysLysGlnThr
715 720 725
GCACTTGTTGAGCTTGTGAAACACAAGCCCAAGGCAACAAAAGAGCAA 2260
AlaLeuValGluLeuValLysHisLysProLysAlaThrLysGluGln
730 735 740 745
CTGAAAGCTGTTATGGATGATTTCGCAGCTTTTGTAGAGAAGTGCTGC 2308
LeuLysAlaValMetAspAspPheAlaAlaPheValGluLysCysCys
750 755 760
AAGGCTGACGATAAGGAGACCTGCTTTGCCGAGGAGGGTAAAAAACTT 2356
LysAlaAspAspLysGluThrCysPheAlaGluGluGlyLysLysLeu
765 770 775
GTTGCTGCAAGTCAAGCTGCCTTAGGCTTATAACATCACATTTAAAAGCA 2406
ValAlaAlaSerGlnAlaAlaLeuGlyLeu
780 785
TCTCAGCCTACCATGAGAATAAGAGAAAGAAAATGAAGATCAAAAGCTT 2455
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 787 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
MetLysTrpValThrPheIleSerLeuLeuPheLeuPheSerSerAla
1 5 10 15
TyrSerArgGlyValPheArgArgThrProLeuGlyProAlaSerSer
20 25 30
LeuProGlnSerPheLeuLeuLysCysLeuGluGlnValArgLysIle
35 40 45
GlnGlyAspGlyAlaAlaLeuGlnGluLysLeuCysAlaThrTyrLys
50 55 60
LeuCysHisProGluGluLeuValLeuLeuGlyHisSerLeuGlyIle
65 70 75 80
ProTrpAlaProLeuSerSerCysProSerGlnAlaLeuGlnLeuAla
85 90 95
GlyCysLeuSerGlnLeuHisSerGlyLeuPheLeuTyrGlnGlyLeu
100 105 110
LeuGlnAlaLeuGluGlyIleSerProGluLeuGlyProThrLeuAsp
115 120 125
ThrLeuGlnLeuAspValAlaAspPheAlaThrThrIleTrpGlnGln
130 135 140
MetGluGluLeuGlyMetAlaProAlaLeuGlnProThrGlnGlyAla
145 150 155 160
MetProAlaPheAlaSerAlaPheGlnArgArgAlaGlyGlyValLeu
165 170 175
ValAlaSerHisLeuGlnSerPheLeuGluValSerTyrArgValLeu
180 185 190
ArgHisLeuAlaGlnProGlyGlyGlyGlyAspAlaHisLysSerGlu
195 200 205
ValAlaHisArgPheLysAspLeuGlyGluGluAsnPheLysAlaLeu
210 215 220
ValLeuIleAlaPheAlaGlnTyrLeuGlnGlnCysProPheGluAsp
225 230 235 240
HisValLysLeuValAsnGluValThrGluPheAlaLysThrCysVal
245 250 255
AlaAspGluSerAlaGluAsnCysAspLysSerLeuHisThrLeuPhe
260 265 270
GlyAspLysLeuCysThrValAlaThrLeuArgGluThrTyrGlyGlu
275 280 285
MetAlaAspCysCysAlaLysGlnGluProGluArgAsnGluCysPhe
290 295 300
LeuGlnHisLysAspAspAsnProAsnLeuProArgLeuValArgPro
305 310 315 320
GluValAspValMetCysThrAlaPheHisAspAsnGluGluThrPhe
325 330 335
LeuLysLysTyrLeuTyrGluIleAlaArgArgHisProTyrPheTyr
340 345 350
AlaProGluLeuLeuPhePheAlaLysArgTyrLysAlaAlaPheThr
355 360 365
GluCysCysGlnAlaAlaAspLysAlaAlaCysLeuLeuProLysLeu
370 375 380
AspGluLeuArgAspGluGlyLysAlaSerSerAlaLysGlnArgLeu
385 390 395 400
LysCysAlaSerLeuGlnLysPheGlyGluArgAlaPheLysAlaTrp
405 410 415
AlaValAlaArgLeuSerGlnArgPheProLysAlaGluPheAlaGlu
420 425 430
ValSerLysLeuValThrAspLeuThrLysValHisThrGluCysCys
435 440 445
HisGlyAspLeuLeuGluCysAlaAspAspArgAlaAspLeuAlaLys
450 455 460
TyrIleCysGluAsnGlnAspSerIleSerSerLysLeuLysGluCys
465 470 475 480
CysGluLysProLeuLeuGluLysSerHisCysIleAlaGluValGlu
485 490 495
AsnAspGluMetProAlaAspLeuProSerLeuAlaAlaAspPheVal
500 505 510
GluSerLysAspValCysLysAsnTyrAlaGluAlaLysAspValPhe
515 520 525
LeuGlyMetPheLeuTyrGluTyrAlaArgArgHisProAspTyrSer
530 535 540
ValValLeuLeuLeuArgLeuAlaLysThrTyrGluThrThrLeuGlu
545 550 555 560
LysCysCysAlaAlaAlaAspProHisGluCysTyrAlaLysValPhe
565 570 575
AspGluPheLysProLeuValGluGluProGlnAsnLeuIleLysGln
580 585 590
AsnCysGluLeuPheGluGlnLeuGlyGluTyrLysPheGlnAsnAla
595 600 605
LeuLeuValArgTyrThrLysLysValProGlnValSerThrProThr
610 615 620
LeuValGluValSerArgAsnLeuGlyLysValGlySerLysCysCys
625 630 635 640
LysHisProGluAlaLysArgMetProCysAlaGluAspTyrLeuSer
645 650 655
ValValLeuAsnGlnLeuCysValLeuHisGluLysThrProValSer
660 665 670
AspArgValThrLysCysCysThrGluSerLeuValAsnArgArgPro
675 680 685
CysPheSerAlaLeuGluValAspGluThrTyrValProLysGluPhe
690 695 700
AsnAlaGluThrPheThrPheHisAlaAspIleCysThrLeuSerGlu
705 710 715 720
LysGluArgGlnIleLysLysGlnThrAlaLeuValGluLeuValLys
725 730 735
HisLysProLysAlaThrLysGluGlnLeuLysAlaValMetAspAsp
740 745 750
PheAlaAlaPheValGluLysCysCysLysAlaAspAspLysGluThr
755 760 765
CysPheAlaGluGluGlyLysLysLeuValAlaAlaSerGlnAlaAla
770 775 780
LeuGlyLeu
785
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
GGCCNNNNNGGCC 13
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CAAGGATCCAAGCTTCAGGGCTGCGCAAGGTGGCGTAG 38
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CGGGGTACCTTAGGCTTAACCCCCCTGGGCCCTGCCAGC 39
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TTAGGCTTAGGTGGTGGCGGTACCCCCCTGGGCC 34
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CAGGGGGGTACCGCCACCACCTAAGCC 27
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 66 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GTTCTACGCCACCTTGCGCAGCCCGGTGGAGGCGGTGATGCACACAAGAGTGAGGTTGCT 60
CATCGG 66
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CAGGGAGCTGGCAGGGCCCAGGGGGGTTCGACGAAACACACCCCTGGAATAAGCCGAGCT 60
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
GAAATGCATAAGCTCTTGCCATTCTCACCG 30
__________________________________________________________________________
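The feature coordinates given for SEQ ID NO:3 can be cross-checked against the residue numbering of SEQ ID NO:4 with a few lines of arithmetic. The following is only an illustrative sketch and is not part of the original listing: the function names are hypothetical, and it assumes that residue 1 is the initiator Met encoded at nucleotides 26 to 28 and that the CDS location 26..2389 includes the stop codon.

```python
# Illustrative check of the SEQ ID NO:3 annotations (helper names are hypothetical).
CDS_START, CDS_END = 26, 2389  # CDS location as listed for SEQ ID NO:3


def codon_number(nt_pos, cds_start=CDS_START):
    """Return the 1-based codon number containing a 1-based nucleotide position."""
    if nt_pos < cds_start:
        raise ValueError("position lies upstream of the CDS")
    return (nt_pos - cds_start) // 3 + 1


def codon_span(first_nt, last_nt):
    """Return the (first, last) codon numbers covered by a nucleotide range."""
    return codon_number(first_nt), codon_number(last_nt)


# Total codons in the CDS; assuming the last one is the stop codon,
# this leaves 787 residues, matching the stated length of SEQ ID NO:4.
total_codons = (CDS_END - CDS_START + 1) // 3
print(total_codons)          # 788

print(codon_span(26, 97))    # sig_peptide feature: (1, 24)
print(codon_span(620, 631))  # polyGly-linker feature: (199, 202)
```

Under these assumptions the signal peptide feature covers residues 1 to 24, so the G-CSF portion begins at Thr25, and the linker feature corresponds to the four consecutive Gly residues at positions 199 to 202 of SEQ ID NO:4.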

I claim:
 1. Recombinant polypeptide comprising G-CSF coupled to albumin or a natural variant of albumin, wherein said G-CSF comprises residues Thr586-Pro759 of the sequence given in FIG. 1 (SEQ ID NO:2 residues Thr610-Pro783).
 2. Polypeptide according to claim 1 wherein the G-CSF is coupled to the C-terminal end of the albumin or natural variant of albumin.
 3. Polypeptide according to claim 1 wherein the G-CSF is coupled to the N-terminal end of the albumin or natural variant of albumin.
 4. Nucleic acid coding for a polypeptide comprising G-CSF coupled to albumin or a natural variant of albumin, wherein said G-CSF comprises residues Thr586-Pro759 of the sequence given in FIG. 1 (SEQ ID NO:2 residues Thr610-Pro783).
 5. Nucleic acid according to claim 4 comprising a nucleotide sequence creating a leader sequence enabling the polypeptide expressed to be secreted.
 6. Expression cassette comprising nucleic acid according to claim 4 operably linked to a transcription initiation region.
 7. Self-replicating plasmid containing an expression cassette according to claim 6.
 8. Recombinant eukaryotic or prokaryotic cell comprising a nucleic acid according to claim 4.
 9. Recombinant cell according to claim 8, selected from the group consisting of a yeast, an animal cell, a fungus and a bacterium.
 10. Recombinant cell according to claim 9, wherein said cell is a yeast.
 11. Recombinant cell according to claim 10, wherein said yeast is of the genus Saccharomyces or Kluyveromyces.
 12. Method for preparing a polypeptide comprising culturing a recombinant cell according to claim 8 under conditions for expression.
 13. Pharmaceutical composition comprising one or more polypeptides according to claim 1 in a pharmaceutically effective vehicle. 